ABSTRACT

Minute differences in the free energy change of unfold-
ing among structures in an oligonucleotide sequence
can lead to a complex population equilibrium, which 
is rather challenging for ensemble techniques to de- 
cipher. Herein, we introduce a new method, molecu- 
lar population dynamics (MPD), to describe the in- 
tricate equilibrium among non-B deoxyribonucleic 
acid (DNA) structures. Using mechanical unfolding 
in laser tweezers, we identified six DNA species in 
a cytosine (C)-rich bcl-2 promoter sequence. Popu- 
lation patterns of these species with and without a 
small molecule (IMC-76 or IMC-48) or the transcrip- 
tion factor hnRNP LL are compared to reveal the MPD 
of different species. With a pattern recognition algo- 
rithm, we found that IMC-48 and hnRNP LL share 
80% similarity in stabilizing i-motifs with 60 s incu- 
bation. In contrast, IMC-76 demonstrates an oppo- 
site behavior, preferring flexible DNA hairpins. With 
120-180 s incubation, IMC-48 and hnRNP LL desta- 
bilize i-motifs, which has been previously proposed 
to activate bcl-2 transcriptions. These results pro- 
vide strong support, from the population equilibrium 
perspective, that small molecules and hnRNP LL can 
modulate bcl-2 transcription through interaction with 
i-motifs. The excellent agreement with biochemical 
results firmly validates the MPD analyses, which, we 
expect, can be widely applicable to investigate com- 
plex equilibrium of biomacromolecules. 



INTRODUCTION 

Unlike proteins in which native structures are often the most 
stable conformation in an amino acid sequence (1), confor- 
mation polymorphism with similar stability is often seen in 
a deoxyribonucleic acid (DNA) or ribonucleic acid (RNA) 
sequence (2-4). The disparity can be rationalized by dif- 
ferent organization strategies employed in these macro- 
molecules. The cooperative arrangement in proteins is a re- 
sult of intricate cross-talks within or between secondary or 
tertiary structures. Once the cross-talks are compromised, 
the overall architecture may collapse as neither secondary 
nor tertiary structures are stable as stand-alone units (5). 
Nucleic acid structures, however, are hierarchical (6-8). 
Even without the higher order interactions, stand-alone sec- 
ondary conformations are stable. Secondary or higher order 
structures in nucleic acids are stabilized by Watson-Crick 
(WC) base pairing in duplex strands (9) or Hoogsteen bond- 
ing in tetraplex strands (10-12). As energetic difference is 
small between different WC and Hoogsteen H-bonds, it be- 
comes possible that multiple structures may coexist in the 
same nucleic acid sequence (13,14). 

Not only do different conformations exist in the same se-
quence, but their interconversions also occur frequently
(15-17). In the context of naturally existing duplex DNA, 
each complementary strand can host a different set of struc- 
tures. For example, in a sequence of more than four tracts 
of guanine (G)-rich repeats, multiple possibilities of G- 
quadruplex (GQ) (18) can form. Each GQ requires four 
tracts of G-repeats to fold into a stack of planar G-quartets, 
which consists of four guanines cross-linked by Hoogsteen 
bonding. In the complementary cytosine (C)-rich strand, 
i-motif structures can form (19). Similar to GQ, each i- 
motif requires four tracts of C-rich repeats to fold into a 
stack of hemiprotonated cytosine:cytosine pairs, which are
more stable in DNA than RNA (20,21). In promoter se-
quences upstream of the transcriptional start site, negative
superhelicity generated by transcriptional firing (22) results
in the potential of forming GQs and i-motifs in comple-
mentary strands. In the case of the insulin promoter, these
have been shown to be mutually exclusive (23). The situ- 
ation is more complex during the transcription, in which 
the nascent RNA strand may participate in the equilibrium 
by forming structures in the RNA strand or those between 
RNA and DNA strands (24,25). All these possibilities bring 
a high level of complexity in the investigation of nucleic 
acids structures. 

It is possible that only one or a few species are active in 
a biological process. To understand the biological roles of 
active nucleic acid structures, therefore, it is necessary to 
clarify the entire population equilibrium. Conventional ex-
perimental techniques such as nuclear magnetic resonance,
circular dichroism (CD) and X-ray crystallography provide detailed struc-
tural information on pure species (26-28). However, when
it comes to a mixture, these methods often fail to resolve 
individual structures due to their ensemble average nature. 
Single-molecule approaches can provide a unique advan- 
tage to decipher individual species in a complex mixture. 
They also have an excellent capability of probing dynamic 
processes (29), which is rather challenging for traditional 
methods due to their reduced temporal resolutions. These 
capabilities afford single-molecule techniques unparalleled 
perspectives to probe the population dynamics of a complex 
system. 

In ecological biology, population dynamics have been 
used to describe the change in biological populations due 
to processes such as immigration, emigration, birth or death 
(30). Factors such as climate or geographical locations can 
be dissected to reveal the specific effect on the population 
dynamics (31). The folding and unfolding of structures in 
a nucleotide sequence and the interconversion among these 
structures closely resemble the population change in bio- 
logical species. Therefore, we apply the concept of popu- 
lation dynamics to describe the effect of different factors, 
such as ligands and proteins, on the population pattern of 
multiple non-B DNA species that can form in a DNA frag- 
ment. To differentiate our approach from that used in biol- 
ogy, we name this method as molecular population dynam- 
ics (MPD). Compared to current ensemble approaches in 
which the influence of external factors on population equi- 
librium is described rather qualitatively, the MPD method 
allows a quantitative measurement, such as similarity and 
additivity (see below), to compare these factors in an intu- 
itive and straightforward fashion. 

In previous publications, Hurley and coworkers have de-
scribed that i-motif structures in the bcl-2 promoter re-
gion are transcriptional modulators (32,33). The popula-
tion equilibrium in this system is highly complex. Not only
do multiple i-motif structures compete for folding in the frag-
ment, 5'-CAG CCC CGC TCC CGC CCC CTT CCT CCC
GCG CCC GCC CCT-3' (see Figure 1a), which contains six
C-rich tracts for a minimum of 15 i-motifs, but the binding of a
transcription factor hnRNP LL or a small molecule (IMC-
76 or IMC-48; see Supplementary Figure S1 for structures)
can influence the equilibrium as well. With a highly sensitive
Population Deconvolution at Nanometer resolution method
(or PoDNano) recently developed in our lab (13,34), we
wish to decipher this complex equilibrium and identify rel-
evant populations responsible for the bcl-2 transcription
from the perspective of MPD.

Figure 1. Mechanical unfolding of the structures formed in a C-rich frag-
ment of the bcl-2 promoter region by laser tweezers. (a) Location of the C-
rich fragment upstream of the P1 promoter in the bcl-2 gene. (b) The
DNA construct that contains the C-rich fragment is sandwiched between
two beads trapped by the laser tweezers. (c) A typical force-extension (F-
X) curve. The red and black curves represent the stretching and relaxing
processes, respectively. (d) Left: a plot of change in contour length (ΔL)
versus force. The change in extension in (c) has been converted to ΔL
(see text). Right: histograms of folded (top) and unfolded populations (bot-
tom). The black curve represents two-peak Gaussian fitting.
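The minimum of 15 i-motifs quoted for six C-rich tracts follows from simple counting, if each i-motif is assumed to use exactly four tracts. The short sketch below only illustrates that arithmetic; the tract labels are placeholders introduced here, not the actual tract sequences.

```python
from itertools import combinations
from math import comb

# Placeholder labels for the six C-rich tracts (hypothetical names,
# not the real tract sequences of the bcl-2 fragment).
tracts = ["C1", "C2", "C3", "C4", "C5", "C6"]

# Assumption: one i-motif uses exactly four tracts, kept in 5'-to-3' order.
four_tract_sets = list(combinations(tracts, 4))

print(len(four_tract_sets))  # number of distinct four-tract choices
print(comb(6, 4))            # the same count as a binomial coefficient
```

Loop arrangements and partially folded species would add further possibilities, which is why the count is a lower bound.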

First, we measured the population pattern of species in 
the bcl-2 promoter fragment in the presence of modulators 
IMC-76, IMC-48 or hnRNP LL with 60 s incubation. Af- 
ter comparing the population pattern without a modulator, 
the effect of each modulator on the MPD is revealed. We 
confirmed biochemical findings (32,33) that IMC-48 or hn- 
RNP LL stabilizes i-motif species over flexible DNA hair- 
pins while IMC-76 shows an opposite behavior. With a sim- 
ple pattern recognition algorithm, we estimated 80% simi- 
larity between the effects of IMC-48 and hnRNP LL. The 
similarity drops to 30% between IMC-76 and IMC-48, and
40% between IMC-76 and hnRNP LL. A mixture of IMC-
48 and hnRNP LL has an additive effect on the i-motif pop- 
ulation dynamics (100% in additive probability), suggesting 
that they stabilize i-motif structures through different sites. 
With 120-180 s incubation, hnRNP LL and IMC-48 show 
a destabilization effect on i-motif populations. These results 
suggest that hnRNP LL first binds to i-motif species, fol- 
lowed by the unfolding of these structures, which is consis- 
tent with those proposed for the activation of bcl-2 tran- 
scription (33). Overall, our MPD analyses provide strong 
support, at the level of population equilibrium, that small 
molecules (IMC-76 and IMC-48) and the transcription fac- 
tor hnRNP LL could modulate bcl-2 transcription through 
interaction with i-motif structures in the bcl-2 promoter re- 
gion. Such a scenario provides evidence that DNA species 
may modulate transcription in a fashion similar to that of 
translational control by riboswitches (35).

MATERIALS AND METHODS

Unless particularly noted, all DNA oligonucleotides used 
in this study were purchased from Integrated DNA Tech-
nologies (IDT, Coralville, IA, USA). All chemicals with
>99% purity were purchased from VWR (West Chester, PA, 
USA). Enzymes were purchased from New England Bio- 
labs (NEB, Ipswich, MA, USA) and surface functionalized 
beads for the laser tweezers experiments were obtained from 
Spherotech (Lake Forest, IL, USA). The IMC-48, IMC-76 
and hnRNP LL were prepared as described in recently sub- 
mitted papers (32,33). 



Preparation of DNA constructs 

The DNA constructs used for single-molecule mechani- 
cal unfolding and refolding experiments in the bcl-2 pro- 
moter region were prepared by sandwiching the 5'-CAG 
CCC CGC TCC CGC CCC CTT CCT CCC GCG CCC 
GCC CCT-3' sequence between two double-stranded DNA 
(dsDNA) handles according to the procedures described 
previously (36). To reduce the interference from the DNA 
handles on the bcl-2 fragment, two deoxythymidines were 
added at each end of the bcl-2 fragment. Briefly, the 2690- 
bp dsDNA handle was prepared based on the pEGFP vec-
tor (Clontech, Mountain View, CA, USA). The vector was
first digested by SacI and EagI restriction enzymes and then
purified with agarose gel. The SacI end was labeled with
digoxigenin by terminal deoxynucleotidyl transferase. The
2028-bp dsDNA handle was amplified from the pBR322
plasmid by polymerase chain reaction (PCR). One end of
the handle was labeled with biotin through a modified PCR
primer. The other end with an XbaI restriction site was in-
corporated through another PCR primer. The 2028-bp ds-
DNA handle was digested with XbaI. A single-stranded
DNA (ssDNA) target that contained the bcl-2 fragment (5'- 
CTA GAG GGT GTG AAA TAG GGG AGA GAT GGG 
TTG AGG GGG GGT GGG GGG GGG TTG GTG GGG 
GGG GGG GGG GTT GGG AGG AAG AGG TAG GGG 
AGG GGG TG-3') was hybridized with two other DNA oli- 
gos (5'-GGG ATG TGT GGG GTA TTT GAG AGG GT- 
3' and 5'-GGG GGA GGG GGT GGG GTA GGT GTT 
GGT GGG-3'), resulting in a dsDNA-ssDNA hybrid as-
sembly with EagI and XbaI overhangs at the two ends and
a single-stranded bcl-2 promoter fragment in the middle. Fi-
nally, this dsDNA-ssDNA hybrid and the two dsDNA han-
dles were ligated by T4 DNA ligase to obtain the final DNA
construct.



Single-molecule force-ramp assay 

To immobilize the DNA construct prepared above on the 
surface of anti-digoxigenin-antibody-coated polystyrene 
beads, 0.1 ng (3.5 × 10^-17 mol) of DNA was mixed with
1 µL of beads (2.10 µm in diameter, 0.5% w/v) in 5 µL of a
10 mM phosphate buffer supplemented with 100 mM KCl
either at pH 5.5 or 6.3. Since no significant difference has
been found for ensemble experiments performed at pH 6.6
and 6.3, we carried out single-molecule experiments in pH
6.3 buffers, which allowed more formation of folded species.
The mixture was diluted to 750 µL with the same buffer af-
ter incubation at room temperature for 30 min. The DNA-
immobilized beads were then injected into a custom-made
chamber and made ready for laser tweezers experiments.
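As an order-of-magnitude check on the molar amount quoted for 0.1 ng of construct, the conversion can be sketched as follows. The average mass of 650 g/mol per base pair is a common rule of thumb assumed here, not a value from this work, and the short single-stranded insert is ignored.

```python
# Assumed average molar mass of one dsDNA base pair (rule of thumb).
AVG_BP_MASS = 650.0  # g/mol per bp

def dsdna_moles(mass_g: float, length_bp: int) -> float:
    """Convert a dsDNA mass in grams to moles for a given length in bp."""
    return mass_g / (length_bp * AVG_BP_MASS)

# The two handles described above: 2690 bp and 2028 bp.
handles_bp = 2690 + 2028

moles = dsdna_moles(0.1e-9, handles_bp)  # 0.1 ng of construct
print(f"{moles:.1e} mol")
```

Under these assumptions the result falls in the 10^-17 mol range, consistent with the amount stated in the text.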

Home-built dual-trap 1064 nm laser tweezers were used 
to carry out the force-ramp assay at 23°C (37,38). One
laser focus (mobile trap) grabbed the anti-digoxigenin-
antibody-coated bead that had already been linked with the
DNA construct, while another focus trapped a streptavidin-
coated bead (1.87 µm diameter). The mobile trap was con-
trolled by a motorized mirror to move one bead close to or
apart from another, which allows the tethering of the DNA 
construct between the two beads through affinity interac- 
tions. After the attachment of the DNA construct between 
the two beads, similar bead movements were carried out in 
the force-ramp assay with a loading rate of 5.5 pN/s (see 
text). 

To evaluate transcription modulators on MPD, 10 µM
IMC-76 or IMC-48 (32) was incubated with the DNA con-
struct for 60 s. For transcription factor hnRNP LL, 280 nM
was used to incubate with the DNA construct for 60 s. To 
evaluate the temporal effect, 120-180 s incubation was used. 

RESULTS AND DISCUSSION 

DNA population pattern is obtained by a single-molecule me- 
chanical unfolding method 

Inspired by the finding that i-motif structures in the pro- 
moter region of bcl-2 gene can regulate Bcl-2 expression 
through the recognition of hnRNP LL, a protein that be- 
longs to a family with transcriptional control activities (33), 
we set out to evaluate different species involved in this regu-
lation process at the single-molecule level by laser tweez-
ers (37). With procedures described in the Materials and
Methods section, we sandwiched the single-stranded C-
rich strand with the sequence (Figure 1a), 5'-CAG CCC
CGC TCC CGC CCC CTT CCT CCC GCG CCC GCC
CCT-3', between the two dsDNA handles. The free ends
of the dsDNA handles were immobilized to two optically
trapped beads via digoxigenin-anti-digoxigenin-antibody
and biotin-streptavidin interactions, respectively (Figure
1b). By moving one of the trapped beads with a loading
rate of 5.5 pN/s in a pH 5.5 phosphate buffer with 100
mM KCl at room temperature, mechanical unfolding ex-
periments were carried out as tethered DNA was stretched
below the plateau force (maximum 60 pN; Figure 1c). An un-
folding event was indicated by a sudden decrease in ten-
sion accompanied by an increase in extension in the force-
extension (F-X) curves. By reversing the direction of the
bead movement, tension in the DNA construct can be re-
duced to 0 pN, allowing structures to refold. Subtraction of
the stretching from the relaxing F-X curve permits us to ob-
tain the change in contour length (ΔL) through the change
in extension (Δx) between these two curves at a particular
force (F) using the worm-like-chain model (34,39):



\[
\frac{\Delta x}{\Delta L} = 1 - \frac{1}{2}\left(\frac{k_{B}T}{FP}\right)^{1/2} + \frac{F}{S} \qquad (1)
\]



where k_B is the Boltzmann constant, T is the absolute tem-
perature, P is the persistence length (51.95 nm) and S is the
stretching modulus (1226 pN) for dsDNA handles (39). A
typical plot of the ΔL-F curve is shown in Figure 1d. The ΔL
obtained here reflects the size of a folded structure, and the
rupture force (F_rupture) at which unfolding occurs depicts its
mechanical stability. Since only one ΔL can be obtained at
F_rupture (Figure 1d), a histogram of ΔL at F_rupture only allows
a rough estimation of the number of species in the DNA
fragment (Figure 2a and b, red bars). To increase the ac-
curacy, we expanded ΔL measurements to the force regions
both below and above F_rupture (Figure 1d, left panel) and
then constructed ΔL histograms for each region (Figure 1d,
right panel). From the difference of the Gaussian centers be-
tween the two ΔL populations, the ΔL for a particular tran-
sition can be determined rather accurately from the stretch-
ing and relaxing F-X curves (Figure 1d, right panel).

Figure 2. Effect of pH on the MPD of the C-rich structures. The ΔL his-
tograms of C-rich populations at pH 5.5 (a) and pH 6.3 (b). The red and
black bars depict ΔL populations measured at F_rupture and deconvoluted
from the PoDNano method, respectively. (c) MPD obtained from the dif-
ferential population pattern between (a) and (b) (see text). The background
arrow shows the direction of the population shift. The change of bubble
size in each string of bubbles depicts the direction of change in a specific
population between two different conditions. Underscored percentage val-
ues indicate those at pH 6.3 while the rest depict those at pH 5.5. The ran-
dom coil structure is shown at the center. See Supplementary Figure S3 for
details of other structures.
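The conversion from a change in extension to a change in contour length with Equation (1) can be sketched in a few lines. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, the temperature default assumes the 23°C stated in the Methods, and P and S are the values quoted in the text.

```python
import math

K_B = 1.380649e-2  # Boltzmann constant expressed in pN·nm/K

def extension_ratio(force_pn, temp_k=296.15, p_nm=51.95, s_pn=1226.0):
    """Return Δx/ΔL from the worm-like-chain expression of Equation (1)."""
    thermal = K_B * temp_k  # k_B·T in pN·nm
    return 1.0 - 0.5 * math.sqrt(thermal / (force_pn * p_nm)) + force_pn / s_pn

def contour_length_change(delta_x_nm, force_pn, **kwargs):
    """Convert a measured change in extension (nm) at a given force to ΔL (nm)."""
    return delta_x_nm / extension_ratio(force_pn, **kwargs)

# Hypothetical example: a 10 nm extension jump observed at 20 pN.
ratio = extension_ratio(20.0)
delta_l = contour_length_change(10.0, 20.0)
print(f"ratio = {ratio:.4f}, ΔL = {delta_l:.2f} nm")
```

At forces of a few tens of piconewtons the ratio stays close to 1, so ΔL is only slightly larger than the measured Δx.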

The ΔL thus obtained represents a particular species
formed in the DNA sequence. To deconvolute different pop-
ulations in the bcl-2 promoter DNA, we used kernel density
treatment followed by resampling statistical analysis (34).
First, we expanded each ΔL value with a Gaussian ker-
nel, the width of which is the average of the standard er-
rors in the ΔL measurements immediately before and after
the unfolding event (Figure 1d, left panel). With a set of
these Gaussian kernels, we then constructed a probability
density distribution of populations from randomly selected
Gaussian kernels in each of 1000 resamplings. Three pre-
dominant populations in each probability distribution were
grouped to construct a histogram (Figure 2a and b, black
histograms). Due to the significant expansion of ΔL mea-
surements by experimentally determined Gaussian kernels,
this PoDNano approach affords ~0.5 nm spatial resolution
in the population identification (34). With the information
of the number and the size of species, we were able to es-
timate the abundance of each population from the origi-
nal ΔL histogram (see Supplementary Figure S2 for details)
(38). The percentage of each species (Table 1), the number
and the size of species constitute three major features of a
population pattern (Figure 2a and b, black histograms).
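The kernel density and resampling steps described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the PoDNano implementation of reference (34): the grid spacing, the peak-picking rule, the function names and the synthetic input data are all hypothetical.

```python
import math
import random
from collections import Counter

def gaussian(x, mu, sigma):
    """Normalized Gaussian kernel centered at mu with width sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def density(grid, kernels):
    """Sum of Gaussian kernels (one per ΔL measurement) evaluated on a grid."""
    return [sum(gaussian(x, mu, sd) for mu, sd in kernels) for x in grid]

def top_peaks(grid, dens, k):
    """Positions of the k tallest local maxima of the density."""
    peaks = [(dens[i], grid[i]) for i in range(1, len(grid) - 1)
             if dens[i - 1] < dens[i] >= dens[i + 1]]
    return [x for _, x in sorted(peaks, reverse=True)[:k]]

def resampled_peak_histogram(kernels, n_resamples=1000, k=3, step=0.25, seed=0):
    """Histogram of predominant peak positions over bootstrap resamplings."""
    rng = random.Random(seed)
    lo = min(mu for mu, _ in kernels) - 2.0
    hi = max(mu for mu, _ in kernels) + 2.0
    grid = [lo + i * step for i in range(int((hi - lo) / step) + 1)]
    counts = Counter()
    for _ in range(n_resamples):
        # Draw kernels at random, with replacement.
        sample = rng.choices(kernels, k=len(kernels))
        for peak in top_peaks(grid, density(grid, sample), k):
            counts[round(peak, 2)] += 1
    return counts

# Synthetic input: (ΔL in nm, kernel width in nm) for two made-up species.
data_rng = random.Random(1)
events = [(data_rng.gauss(5.0, 0.3), 0.4) for _ in range(30)]
events += [(data_rng.gauss(9.0, 0.3), 0.4) for _ in range(30)]

hist = resampled_peak_histogram(events, n_resamples=200)
print(hist.most_common(4))
```

In this sketch each resampling contributes its tallest peaks to a common histogram, so positions that recur across resamplings stand out as populations while sporadic bumps are diluted.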



Flexible hairpin and i-motif species are present in the C-rich 
strand of the bcl-2 promoter 

The population pattern in Figure 2a reveals DNA species 
that involve 15, 21, 26 and full-length (>31) nucleotides 
(nts) in the C-rich fragment (see Supplementary Informa- 
tion for the conversion of AL to the number of nucleotides). 
These species have respective abundance of 5.4, 17.8, 28.5 
and 11.4% in a pH 5.5, 10 mM phosphate buffer with 100 
mM KCl (Table 1 and Figure 2c). Due to the hemiproto-
nated nature of the intercalating cytosine:cytosine pairs, i-
motif structures are well known to be pH sensitive. To test
whether these species can form i-motif structures, we re- 
peated the experiment at pH 6.3, a condition similar to that 
employed by Hurley and coworkers [see the Materials and 
Methods section and (33)]. Application of the PoDNano 
analysis reveals a quite different population pattern at this 
pH (Figure 2b and Table 1). While there is a new 9-nt species 
with 2.3% in population, the percentage population for the 
15-, 21-, 26-nt and full-length species is reduced to 0, 6.8, 
7.0 and 3.5%, respectively. Such a pH dependency suggests 
that the species larger than 15 nts contain i-motif structures. 
The appearance of the 9-nt species at pH 6.3, but not at pH 
5.5, suggests that it should not be an i-motif. In fact, no i-
motif structures known so far with fewer than 15 nts can form
in this sequence. Instead, the 9-nt species could be a flexible 
hairpin that is stable at the higher pH (See Supplementary 
Figure S3 for a possible structure). 

To evaluate the effect of pH on the population distribu- 
tion, we subtracted the percentage of each population at 
pH 5.5 (Figure 2a, bottom panel and Table 1) from that 
at pH 6.3 (Figure 2b, bottom panel and Table 1). As most 
F-X curves contain only one rupture event, we argue that 
structures revealed by mechanical unfolding may not have 
intermediates, which would lead to more than one unfold-
ing event. From a topological perspective, it is not possible to
interconvert between different i-motifs and flexible hairpins 
without unfolding to random coils first, although partially 
unfolded species could be generated without such intercon- 
version (40,41). For simplicity, we constructed an MPD dia- 
gram centered on the unfolded DNA fragment (Figure 2c). 
This diagram gives a clear visualization on the change of 
population with pH. For example, it clearly shows that the 
15-, 21-, 26-nt and the full-length species (>31 nts) reduce 
their populations at higher pH (Figure 2c) and therefore 
have i-motif elements in their structures. We assigned the 
full-length population as a fully folded i-motif structure (see 
Supplementary Figure S3 for a possible structure), since no 
stable flexible hairpin of similar size can be found in mfold
calculations (Supplementary Figure S4). Assuming i-motifs
in this C-rich DNA fragment behave similarly with pH in-
crease, we reasoned that species with population reduction
no smaller than that of the full-length i-motif (69% in reduc-
tion; Table 1) are likely to contain i-motif elements in their
structures. Therefore, we assigned the 15-nt (5.4% → 0%
reduction) and 26-nt (75% in reduction) species as i-motifs
[see Supplementary Figure S3 for possible structures; notice
it is possible for the 26-nt to assume partially folded confor-
mation that contains pH-sensitive hemiprotonated C:CH+
stackings (40)]. Although it is possible that the 22-nt species
could be i-motif only, our experiments in a pH 5.5 buffer
indicated that IMC-76 increases the 22-nt population from
17.8% to 26.9% (Table 1). Since IMC-76 should decrease i-
motif populations while increasing those of flexible hairpins
(32), the 22-nt could be a mixture of flexible hairpins and
i-motifs (see Supplementary Figure S3 for possible struc-
tures). Finally, due to the unique presence of the <15-nt
species at pH 6.3, this structure has been assigned as flex-
ible hairpins (see above and Supplementary Figure S3 for
possible structures).

Table 1. Percentage population of individual species under different conditions

pH    Time    IMC-76   IMC-48   hnRNP   <15 nt   15 nt   22 nt   26 nt   >32 nt
5.5   -       -        -        -       0.0      5.4     17.8    28.5    11.4
5.5   60 s    -        +        -       0.0      10.3    0.0     29.1    28.3
5.5   60 s    +        -        -       0.0      2.0     26.9    18.8    2.9
5.5   60 s    +        +        -       0.0      8.6     17.9    24.3    6.4
6.3   -       -        -        -       2.3      0.0     6.8     7.0     3.5
6.3   60 s    +        -        -       8.7      0.0     2.4     5.2     1.6
6.3   60 s    -        +        -       0.0      11.2    0.0     11.4    4.3
6.3   60 s    -        -        +       2.4      0.0     4.7     15.5    4.7
6.3   60 s    +        -        +       6.1      7.7     0.0     10.4    1.2
6.3   60 s    -        +        +       1.9      7.0     0.0     40.5    7.0
6.3   120 s   -        +        -       0.0      11.8    7.6     4.2     1.4
6.3   120 s   -        -        +       10.9     0.0     5.9     4.9     4.2
6.3   120 s   -        +        +       7.6      9.2     3.1     4.6     3.1
6.3   180 s   -        +        -       0.0      10.3    7.5     4.6     3.4
6.3   180 s   -        -        +       17.0     0.0     8.5     3.5     1.0
6.3   180 s   -        +        +       5.7      5.3     5.3     6.5     2.0

Each row is one condition; + and - mark the presence and absence of a modulator.
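The pH criterion used for the assignments can be checked directly against the modulator-free values of Table 1. The sketch below is illustrative only; the dictionary keys are labels chosen here, and the threshold rule is the one stated in the text (reduction no smaller than that of the full-length species).

```python
# Populations (%) without modulators, taken from Table 1.
ph55 = {"15 nt": 5.4, "22 nt": 17.8, "26 nt": 28.5, "full length": 11.4}
ph63 = {"15 nt": 0.0, "22 nt": 6.8, "26 nt": 7.0, "full length": 3.5}

def percent_reduction(before, after):
    """Relative loss of population on going from pH 5.5 to pH 6.3, in percent."""
    return 100.0 * (before - after) / before

reduction = {name: percent_reduction(ph55[name], ph63[name]) for name in ph55}

# Criterion from the text: at least as much reduction as the full-length i-motif.
threshold = reduction["full length"]
likely_i_motif = [name for name, r in reduction.items() if r >= threshold]

for name, r in reduction.items():
    print(f"{name}: {r:.1f}% reduction")
print(likely_i_motif)
```

With these numbers the 22-nt species falls just below the full-length reduction, which is in line with its treatment as a possible mixture rather than a pure i-motif.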

MPD of C-rich DNA structures is modulated by small 
molecules and the transcription factor hnRNP LL 

With the assignment of different populations, we proceeded
to quantify the effects of modulators identified by Hurley
and coworkers (32,33) during the transcription of the bcl-
2 gene from the perspective of MPD. First, we conducted
mechanical unfolding experiments in the presence of IMC-
76 to obtain a population pattern in the DNA fragment at
pH 6.3 by the PoDNano approach (Figure 3a). This popu-
lation pattern is then compared to that without ligand (Fig-
ure 2b) to obtain the MPD controlled by the ligand IMC-
76. As shown in Figure 3d, the populations of the 22-, 26-nt
and the full-length species decrease whereas that of the flex-
ible hairpin (<15-nt) increases. This result suggests that
the IMC-76 ligand helps to stabilize the flexible hairpin and
shifts the population equilibrium toward smaller species
(see Figure 3 in (32) for binding assays). An opposite trend
was observed when IMC-48 was evaluated (Figure 3e).
The populations of the 15-, 26-nt and the full-length species
increase at the expense of both 22- and <15-nt species. As
the former three species contain i-motif structural elements
whereas the latter two contain flexible hairpins, it appears that IMC-
48 stabilizes i-motifs over flexible hairpins. Similar results
were observed in the presence of transcription factor hn-
RNP LL, which decreases the 22-nt species while increas-
ing the populations of the 26-nt and the full-length species
(Figure 3f).

To quantitatively compare the effects of different factors
on the MPD of DNA species in the bcl-2 promoter frag-
ment, we designed a simple pattern recognition algorithm
based on pairwise analysis (42). To facilitate the compari-
son, we first digitized the change in population of species
i (C_i). We assigned C_i = 1 for an increase in population,
C_i = -1 for a decrease and C_i = 0 for no change (Sup-
plementary Tables S1 and S5). Pairwise comparison of fac-
tors 1 and 2 is carried out by the sum of the multiplica-
tion of C_i values for all n species, $\sum_{i=1}^{n} C_{1,i}C_{2,i}$. Such a
treatment gives a similarity score between n (identical pat-
terns) and -n (totally opposite patterns). To compare fac-
tors that affect different numbers of species, we converted
the similarity score to percent similarity (s) by the expres-
sion $s = \frac{n + \sum_{i=1}^{n} C_{1,i}C_{2,i}}{2n} \times 100\%$. This calculation gives s
= 100% between identical MPD (positive correlation), s =
0% for two totally opposite MPD (anti-correlation) and s =
50% for no correlation.

Using this algorithm, we quantified the similarity be- 
tween different bcl-2 gene modulators (Table 2 and Supple- 
mentary Table S2). We found that the percent similarity be- 
tween IMC-48 and hnRNP LL is 80%, demonstrating sim- 
ilar effects on the MPD between these two modulators. In 
comparison, the similarity percentage is 30% between IMC- 
76 and IMC-48 and 40% between IMC-76 and hnRNP LL, 
indicating that IMC-76 has an opposite effect on the MPD 
with respect to IMC-48 or hnRNP LL. These results agree 
very well with Hurley's finding that IMC-76 destabilizes i- 
motif structures, whereas IMC-48 and hnRNP LL stabilize 
these structures in biochemical assays (32,33). 
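The pairwise comparison can be sketched with the pH 6.3, 60 s values of Table 1. This is a minimal illustration rather than the authors' code: the assignment of Table 1 values to conditions follows our reading of the table, and the 0.5 percentage-point tolerance used to call "no change" is a hypothetical choice (the actual digitized values are in the Supplementary Tables).

```python
def digitize(before, after, tol=0.5):
    """Map each population change to +1, -1 or 0 (|change| <= tol is no change)."""
    signs = []
    for b, a in zip(before, after):
        diff = a - b
        signs.append(0 if abs(diff) <= tol else (1 if diff > 0 else -1))
    return signs

def percent_similarity(c1, c2):
    """Percent similarity s = (n + sum of products) / (2n) x 100%."""
    n = len(c1)
    score = sum(x * y for x, y in zip(c1, c2))
    return 100.0 * (n + score) / (2 * n)

# Populations (%) at pH 6.3; species order: <15, 15, 22, 26, >32 nt.
no_modulator = [2.3, 0.0, 6.8, 7.0, 3.5]
imc76 = [8.7, 0.0, 2.4, 5.2, 1.6]
imc48 = [0.0, 11.2, 0.0, 11.4, 4.3]
hnrnp_ll = [2.4, 0.0, 4.7, 15.5, 4.7]

c76 = digitize(no_modulator, imc76)
c48 = digitize(no_modulator, imc48)
cll = digitize(no_modulator, hnrnp_ll)

print(percent_similarity(c48, cll))  # IMC-48 vs hnRNP LL -> 80.0
print(percent_similarity(c76, c48))  # IMC-76 vs IMC-48   -> 30.0
print(percent_similarity(c76, cll))  # IMC-76 vs hnRNP LL -> 40.0
```

Under these assumptions the sketch returns the 80%, 30% and 40% values reported in this section.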

To evaluate whether there is an additive effect between
different modulators, we performed the same mechanical
unfolding experiments in the IMC-48/hnRNP LL mixture
(Figure 4) and in the IMC-76/hnRNP LL mixture (Supple-
mentary Figure S5). Using the algorithm for pattern recog-
nition described above, we compared the similarity percent-
age of each modulator, or their mixture, with respect to
IMC-48. The results are shown in a semi-circle similarity
plot in Figure 5 (see Supplementary Information for the
construction of the similarity plot). As expected, the effects
of IMC-48, hnRNP LL and their mixture are highly corre-
lated, while IMC-76 shows an anti-correlated behavior. In-
terestingly, the mixture of two anti-correlated modulators
(IMC-76 and hnRNP LL) shows a non-correlated behav-
ior (s = 55%) with respect to IMC-48. Likewise, two anti-
correlated small molecules IMC-48 and IMC-76 show a
non-correlated behavior (s = 50%) when they mix together.
These results suggest that there is an additive effect between
different modulators.

Further evidence for an additive effect comes from the 
MPD analyses. Figure 4a showed that the net effect of hn- 
RNP LL in the presence of IMC-48 is to further increase 
the population of the 26-nt and full-length i-motif species. 
Similarly, more i-motif structures (15-, 26-nt and full-length 
species) were formed due to the net effect of IMC-48 in the 
context of hnRNP LL (Figure 4c).

Figure 3. Effect of transcription modulators on the MPD of the C-rich structures at pH 6.3. Population patterns for IMC-76 (10 µM) (a), IMC-48 (10
µM) (b) and hnRNP LL (280 nM) (c). The red and black bars depict ΔL populations measured at F_rupture and deconvoluted from the PoDNano method,
respectively. MPD for IMC-76 (d), IMC-48 (e) and hnRNP LL (f). Each MPD is obtained after comparison of population patterns with [(a), (b) or (c)] and
without (Figure 2b) a particular transcription modulator. The background arrow shows the direction of the population shift. The change of bubble size
in each string of bubbles depicts the direction of change in a specific population between two different conditions. Underscored percentage values indicate
those with modulators while the rest are those without ligands or proteins. The random coil structure is shown at the center. See Supplementary Figure S3
for details of other structures.

Table 2. Comparison of percent similarity in MPD between different factors with 60 s incubation

Figure 4. Combined effect of IMC-48 and hnRNP LL on the MPD of C-rich structures at pH 6.3. (a) Net effect of hnRNP LL (280 nM) in the presence of
IMC-48 (10 µM). (b) Population patterns of the C-rich species in the presence of IMC-48 (10 µM) and hnRNP LL (280 nM) with 60 s incubation. The red
and black bars depict ΔL populations measured at F_rupture and those deconvoluted from the PoDNano method, respectively. (c) Net effect of IMC-48 (10
µM) in the presence of hnRNP LL (280 nM). Each MPD is obtained after comparison of the population pattern with a particular modulator (Figure 3b
or c) and that with the mixture of IMC-48 and hnRNP LL (b). Underscored percentage values indicate those with both modulators while the rest are those
with only one modulator. The background arrow shows the direction of the population shift. The change of bubble size in each string of bubbles depicts
the direction of a specific population shift. The random coil structure is shown at the center. See Supplementary Figure S3 for details of other structures.

Figure 5. Semi-circle similarity plot for different modulators and their
mixtures. Percentage similarity for a specific modulator i (PSi, shown inside
the gradient arrow) is obtained after comparison with IMC-48. The simi-
larity between two modulators (i and j) can be calculated as s% = 100% -
(PSi - PSj), where PSi > PSj.



To quantify this additive effect, we evaluated the effect 
of the IMC-48 and hnRNP LL mixture with respect to 
IMC-48 or hnRNP LL. For each species in the population 
pattern, an additivity is confirmed when the effect of the 
mixture is no smaller than the combined effects of IMC- 
48 and hnRNP LL (see Supplementary Information and 
Supplementary Table S3-2). Out of the five species, four 
have shown this additivity, which is equivalent to 80% in 
the probability of additivity between IMC-48 and hnRNP 
LL. The probability of additivity rises to 100% for i-motif
species (15-nt to full length). The high additivity suggests 
separate binding sites for IMC-48 and hnRNP LL. In com- 
parison, the probability of additivity between IMC-76 and 
hnRNP LL is 40% for all species and 50% for i-motifs (Sup- 
plementary Figure S5 and Supplementary Table S3-1); for 
IMC-48 and IMC-76, it is 60% and 50%, respectively (Sup- 
plementary Table S3-3). It is possible that the binding sites 
may partially overlap for these two pairs of modulators. 
The different additive effects between the IMC-76/hnRNP 
LL pair (50%) and the IMC-48/hnRNP LL pair (100%) 
have been well reproduced qualitatively in the binding assay 
performed by Hurley and coworkers [see Figure 5 in (33)], 
which validates the new method described here. Additional 
validation of the method came from the population anal- 
ysis of species in the human telomeric RNA (TERRA). It 
has been well established that TERRA GQ can be bound 
with ligand carboxypyridostatin (cPDS) or anti-GQ anti- 
body BG4 (43,44). Using the same MPD approach, indeed, 
we found 100% similarity between the effect of cPDS and 
BG4 on the structures formed in the TERRA (Supplemen- 
tary Figure S6). 
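
The additivity criterion described above can be illustrated with a short Python sketch. This is not the authors' implementation (the actual procedure is given in the Supplementary Information). It assumes that each effect is a signed change in population percentage relative to the modulator-free pattern, and it reads "no smaller than the combined effects" as the mixture shifting a population at least as far as the sum of the two individual shifts, in the same direction. The function name and all numbers are hypothetical and chosen only to show the arithmetic.

```python
from typing import Sequence


def additivity_probability(
    effect_a: Sequence[float],
    effect_b: Sequence[float],
    effect_mix: Sequence[float],
) -> float:
    """Return the percentage of species whose mixture effect is additive.

    Assumption (not from the paper): an effect is a signed population
    change, and a species counts as additive when the mixture moves the
    population at least as far as the summed individual effects, in the
    direction of that sum.
    """
    if not (len(effect_a) == len(effect_b) == len(effect_mix)):
        raise ValueError("all effect lists must cover the same species")
    if len(effect_a) == 0:
        raise ValueError("at least one species is required")

    additive_count = 0
    for a, b, mix in zip(effect_a, effect_b, effect_mix):
        combined = a + b
        if combined >= 0:
            is_additive = mix >= combined  # population increase
        else:
            is_additive = mix <= combined  # population decrease
        additive_count += is_additive

    return 100.0 * additive_count / len(effect_a)


# Hypothetical population changes (percentage points) for five species.
modulator_a = [5.0, -3.0, 2.0, -4.0, 1.0]
modulator_b = [4.0, -2.0, 1.0, -1.0, 2.0]
mixture = [10.0, -6.0, 3.0, -5.0, 1.0]

probability = additivity_probability(modulator_a, modulator_b, mixture)
print(f"probability of additivity: {probability:.0f}%")
```

With these made-up inputs, four of the five species satisfy the criterion, which reproduces the way a value such as 80% arises from a five-species pattern.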

By using the similarity comparison and additivity calcu- 
lation in the new MPD method, we have quantified for the 
first time the effect of transcription modulators on the MPD 
of non-B DNA structures. With the in vivo presence and the 
biological activity of non-B DNA structures firmly established 
(43,45), we envision that the method will be instrumental in 
understanding the biological implications of these non-B DNA 
structures by quantitative evaluation of the population 
equilibrium affected by specific cellular factors. 



Temporal effect on the MPD of the C-rich DNA structures 

It has been reported by Hurley and coworkers that initial 
binding of hnRNP LL to i-motif structures in the bcl-2 
promoter fragment eventually leads to the destabilization 
of i-motifs with longer incubation time (33). To illustrate 
this process from a perspective of molecular population dy- 
namics, we first mechanically unfolded DNA structures by 
laser tweezers. This was followed by incubation for different 
times (120-180 s versus 60 s) to obtain the temporal effect 
on the MPD in the presence of IMC-48, hnRNP LL or 
both. 

As shown in Figure 6, with 180 s incubation, populations 
of species larger than 26 nts decrease whereas that of the 22- 
nt species increases. As the 26-nt and the full-length species 
contain i-motif elements whereas the 22-nt species is a mix- 
ture of i-motifs and flexible hairpins, this result suggested 
that IMC-48 and hnRNP LL destabilize i-motifs over flexi- 
ble hairpins with 180 s incubation. Using the pattern recog- 
nition algorithm discussed above (see Supplementary Ta- 
ble S5 for the scores), we found that the similarity between 
IMC-48 and hnRNP LL is 70% (Table 3 and Supplemen- 
tary Figure S7). In addition, the temporal effect of the IMC- 
48 and hnRNP LL mixture has 70% similarity with that of 
IMC-48 and 90% similarity with that of hnRNP LL (Ta- 
ble 3). As 50% similarity depicts non-correlation between 
two factors (see Figure 5), the similarity of 70% represents 
a moderately positive correlation between the IMC-48 and 
the hnRNP LL. Consistent with this, the additivity calcu- 
lation (see above) showed that with 180 s incubation, these 
two factors have a reduced probability of additivity among 
DNA species (60% for all species and 50% for i-motifs; Sup- 
plementary Table S6) compared to short-term incubation. 
To probe the temporal effect more accurately, we also per- 
formed experiments with 120 s incubation. As shown in Ta- 
ble 1 and Supplementary Figure S8, the trend of the popu- 
lation dynamics is well maintained within the experimental 
error of the measurement. 
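
The relation given in the Figure 5 caption, s% = 100% - (PSi - PSj) with PSi > PSj, can be written as a small Python helper. This is a minimal sketch under stated assumptions: the function name is hypothetical, the absolute value is used only to enforce the PSi > PSj ordering regardless of argument order, and the input values are illustrative rather than measured. The PS values themselves come from the pattern recognition scoring, which is not reproduced here.

```python
def similarity_between(ps_i: float, ps_j: float) -> float:
    """Similarity (in %) between two modulators from their PS values.

    ps_i and ps_j are the percentage similarities of modulators i and j
    with respect to the common reference (IMC-48 in Figure 5).
    Implements s% = 100% - (PSi - PSj) with PSi >= PSj; abs() applies
    the ordering so that the arguments can be passed either way round.
    """
    for ps in (ps_i, ps_j):
        if not 0.0 <= ps <= 100.0:
            raise ValueError("percentage similarity must lie in [0, 100]")
    return 100.0 - abs(ps_i - ps_j)


# Illustrative PS values only (not taken from the measurements).
print(similarity_between(80.0, 30.0))  # two modulators 50 points apart
print(similarity_between(30.0, 80.0))  # same pair, reversed order
```

On this scale, a result of 50% corresponds to the non-correlation point mentioned in the text, and values above it indicate increasingly similar effects.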

It has been proposed by Hurley that hnRNP LL binds 
i-motif structures through the sequences CCCGC and 
CGCCC in the lateral loops. This is followed by the disas- 
sembly of the i-motif into an ssDNA bound with hnRNP 
LL, which activates the bcl-2 transcription (33). With suf- 
ficient template tension, the bound hnRNP LL is expected 
to be ejected from ssDNA (46). This event gives rise to a 
rupture transition in the F-X curve with a small change-in- 
contour-length (ΔL), which is a result of releasing flexible 
DNA segments between the two binding sequences for 
the hnRNP LL [see Figure 8 in (33)]. The observation of 
small rupture events (<13-nt transitions; Figure 6b, c, e and 
f), therefore, supports these sequential events involved in 
the activation of bcl-2 transcription. It is rather surprising 
that with long-term incubation, IMC-48 also destabilizes i- 
motif, although with a weaker effect. Such a result, how- 
ever, is in agreement with the observation that IMC-48 can 
activate bcl-2 transcription (32), presumably through desta- 
bilization of i-motif structures similar to the hnRNP LL 
(33). Based on the fact that small transitions (<13 nt; Fig- 
ure 6a and d) were not observed in the presence of IMC-48, 
the detailed mechanisms for the i-motif destabilization are 
Figure 6. Temporal effect of IMC-48 and hnRNP LL on the MPD of the C-rich structures at pH 6.3. Population patterns for IMC-48 (10 µM) (a), 
hnRNP LL (280 nM) (b) and both (c). The red and black bars depict ΔL populations measured at F_rupture and deconvoluted from the PoDNano method, 
respectively. MPD for IMC-48 (d), hnRNP LL (e) and both (f). Each MPD is obtained after comparison of the population patterns between 180 s [(a), 
(b) or (c)] and 60 s incubation (Figure 3b and c and Figure 4b, respectively). Underscored percentage values indicate those with 180 s incubation while 
the rest are those with 60 s incubation. The background arrow shows the direction of the population shift. The change of bubble size in each string of 
bubbles depicts the direction of change in a specific population between two different conditions. The random coil structure is shown at the center. See 
Supplementary Figure S3 for details of other structures. 



Table 3. Comparison of percent similarity in MPD (180 versus 60 s incubation) between different factors 

                     IMC-48     hnRNP LL     IMC-48+hnRNP LL 
IMC-48               100% 
hnRNP LL             70%        100% 
IMC-48+hnRNP LL      70%        90%          100% 

Note. The similarity is calculated from digitized change in population (see Supplementary Table S5). 



likely to be different between IMC-48 and hnRNP LL dur- 
ing long-term incubation. 

The potential effect of the i-motif and flexible DNA struc- 
tures formed in the bcl-2 promoter on the regulation of bcl- 
2 transcription and the small-molecule-induced MPD of 
these non-B DNA species closely resemble the function of 
riboswitches (35). Riboswitch segments in messenger RNA 
(mRNA) are known to assume versatile conformations de- 
pendent on the presence of small molecules, which are often 
effector molecules of the protein encoded by the mRNA. 
Different RNA structures in a riboswitch reach a popula- 
tion equilibrium, which then regulates the production of the 
encoded protein. Therefore, the MPD under the control of 
small molecules is highly important for a riboswitch to mod- 
ulate protein expression. The results presented here provide 
the first evidence that, similar to a riboswitch for translational 
control, DNA structures, i-motifs in particular, may carry 
out transcriptional control via MPD modulated by small 
molecules. 



CONCLUSIONS 

By identifying six different populations in a C-rich bcl-2 
promoter fragment using a mechanical unfolding strategy, we 
analyzed the population equilibrium of these species with 
a new method, MPD. With a simple algorithm for pattern 
recognition, we found that IMC-48 and hnRNP LL share a similar 
effect on the stabilization of i-motifs over flexible DNA 
hairpins, whereas IMC-76 shows a reversed effect with 60 s 
incubation. Longer incubation (120-180 s) causes IMC-48 
and hnRNP LL to destabilize i-motifs. These results agree 
strikingly well with those observed in biochemical experi- 
ments, validating this new single-molecule method. Our re- 
sults also provide evidence, at the level of population equi- 
librium of non-B DNA species, that bcl-2 transcription can 
be modulated by i-motif populations under the control of 
small molecules and the transcription factor hnRNP LL. 
Together with the results from chemosensitization of cancer 
cells, promoter binding and mRNA production (32,33), the 
evidence for the involvement of i-motif structures in the bcl- 
2 transcription becomes compelling. These findings start 
to shed light on the potential of non-B DNA structures as 
new transcriptional regulation elements through MPD con- 
trolled by small molecules and transcription factors. As a 
natural extension, we anticipate the MPD approach devel- 
oped here can be used as a new tool to delineate complex 
and dynamic population equilibrium among nucleic acid 
and protein structures. 